Graph neural networks (GNNs) have become the default machine learning model for relational datasets, including protein interaction networks, biological neural networks, and scientific collaboration graphs. We use tools from statistical physics and random matrix theory to precisely characterize generalization in simple graph convolution networks on the contextual stochastic block model. The derived curves are phenomenologically rich: they explain the distinction between learning on homophilic and heterophilic graphs, and they predict double descent, whose existence in GNNs has been questioned by recent work. Our results are the first to accurately explain the behavior not only of a stylized graph learning model but also of complex GNNs on messy real-world datasets. To wit, we use our analytic insights about homophily and heterophily to improve the performance of state-of-the-art graph neural networks on several heterophilic benchmarks by simply adding negative self-loop filters.
translated by 谷歌翻译
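The negative self-loop idea in the graph neural network abstract above can be illustrated with a minimal sketch. The abstract does not specify the filter, so the symmetric normalization, the function name `propagate`, and the parameter `self_loop_weight` are all assumptions made for illustration only.

```python
import numpy as np

def propagate(adj, features, self_loop_weight):
    """One linear graph-convolution step: (D^{-1/2} A D^{-1/2} + c I) X.

    `self_loop_weight` (c) is a hypothetical parameter; c < 0 gives a
    negative self-loop. The normalization is an assumption, not taken
    from the abstract.
    """
    adj = np.asarray(adj, dtype=float)
    features = np.asarray(features, dtype=float)
    deg = adj.sum(axis=1)
    # Guard against isolated nodes (degree 0).
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    norm_adj = inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    filt = norm_adj + self_loop_weight * np.eye(adj.shape[0])
    return filt @ features

# Two connected nodes with opposite features: a tiny heterophilic graph.
adj = [[0, 1], [1, 0]]
x = [[1.0], [-1.0]]
print(propagate(adj, x, 1.0))   # positive self-loop: the neighbor cancels the node's own signal
print(propagate(adj, x, -1.0))  # negative self-loop: the class signal is preserved, with flipped sign
```

On this toy graph a positive self-loop averages a node with its oppositely labeled neighbor and erases the signal, while a negative self-loop keeps the two classes separated. This matches the intuition the abstract gives for heterophilic benchmarks, but it is not a reproduction of the paper's experiments.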
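Malaria is a life-threatening disease that affects millions. Microscopy-based assessment of thin films is the standard method for (i) determining the malaria species and (ii) quantifying high-parasitemia infections. Full automation of malaria microscopy through machine learning (ML) is a challenging task because prepared slides vary widely in quality and presentation, and artifacts often outnumber the relatively rare parasites. In this work, we describe a complete, fully automated framework for thin-film malaria analysis that applies ML methods, including convolutional neural networks (CNNs), trained on a large and diverse dataset of field-prepared thin films. The quantification and species identification results are close to accurate enough for the concrete needs of drug resistance monitoring and clinical use cases. We focus our methods and performance metrics on the requirements of the field use case. We discuss key issues and important metrics for applying ML methods to malaria microscopy.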
Because it can exploit intrinsic supervision information, contrastive learning has recently achieved promising performance in deep graph clustering. However, we observe that two drawbacks of the positive and negative sample construction mechanisms prevent existing algorithms from improving further. 1) The quality of positive samples depends heavily on carefully designed data augmentations, and inappropriate augmentations easily lead to semantic drift and indiscriminative positive samples. 2) The constructed negative samples are unreliable because they ignore important clustering information. To solve these problems, we propose a Cluster-guided Contrastive deep Graph Clustering network (CCGC) that mines the intrinsic supervision information in the high-confidence clustering results. Specifically, instead of conducting complex node or edge perturbation, we construct two views of the graph by designing special Siamese encoders whose weights are not shared between the sibling sub-networks. Then, guided by the high-confidence clustering information, we carefully select and construct the positive samples from the same high-confidence cluster in the two views. Moreover, to construct semantically meaningful negative sample pairs, we regard the centers of different high-confidence clusters as negative samples, thus improving the discriminative capability and reliability of the constructed sample pairs. Lastly, we design an objective function that pulls together samples from the same cluster and pushes away those from other clusters by maximizing the cross-view cosine similarity between positive samples and minimizing it between negative samples. Extensive experimental results on six datasets demonstrate the effectiveness of CCGC compared with existing state-of-the-art algorithms.
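The objective described in the CCGC abstract above can be sketched roughly as follows. This is a minimal illustration and not the authors' loss: the abstract only says that cross-view cosine similarity is maximized for positives and minimized for negatives, so the specific form (mean of `1 - cos` for same-cluster pairs plus mean `cos` between centers of different clusters) and the names `cluster_guided_loss`, `z1`, `z2`, and `labels` are assumptions.

```python
import numpy as np

def normalize_rows(x):
    """Scale each row to unit length (rows of zeros are left near zero)."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)

def cluster_guided_loss(z1, z2, labels):
    """Toy cluster-guided contrastive loss (assumed form, for illustration).

    z1, z2 : embeddings of the same high-confidence nodes from two views.
    labels : high-confidence cluster assignment of each node.
    """
    z1 = normalize_rows(np.asarray(z1, dtype=float))
    z2 = normalize_rows(np.asarray(z2, dtype=float))
    labels = np.asarray(labels)

    # Positives: cross-view pairs whose nodes share a high-confidence cluster.
    sim = z1 @ z2.T
    same_cluster = labels[:, None] == labels[None, :]
    pos_term = np.mean(1.0 - sim[same_cluster])

    # Negatives: centers of different clusters, compared across views.
    clusters = np.unique(labels)
    c1 = normalize_rows(np.stack([z1[labels == k].mean(axis=0) for k in clusters]))
    c2 = normalize_rows(np.stack([z2[labels == k].mean(axis=0) for k in clusters]))
    center_sim = c1 @ c2.T
    off_diag = ~np.eye(len(clusters), dtype=bool)
    neg_term = np.mean(center_sim[off_diag]) if len(clusters) > 1 else 0.0

    return pos_term + neg_term
```

Under this assumed form the loss is lowest when nodes from the same cluster agree across views and the cluster centers point in different directions, which is the behavior the abstract describes.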
Current optical communication systems minimize bit or symbol errors without considering the semantic meaning behind the digital bits, and therefore transmit a lot of unnecessary information. We propose and experimentally demonstrate a semantic optical fiber communication (SOFC) system. Instead of encoding information into bits for transmission, the system extracts semantic information from the source using deep learning. The generated semantic symbols are then transmitted directly through an optical fiber. Compared with the bit-based structure, the SOFC system achieved higher information compression, more stable performance, especially in the low received optical power regime, and enhanced robustness against optical link impairments. This work introduces an intelligent optical communication system at the human analytical thinking level, which is a significant step toward a breakthrough in the current optical communication architecture.
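Generating high-quality artistic portrait videos is an important and desirable task in computer graphics and vision. Although a series of successful portrait image toonification models have been proposed, these image-oriented methods have obvious limitations when applied to videos, such as a fixed frame size, the requirement of face alignment, missing non-facial details, and temporal inconsistency. In this work, we investigate the challenging task of controllable high-resolution portrait video style transfer by introducing a novel VToonify framework. Specifically, VToonify leverages the mid- and high-resolution layers of StyleGAN to render high-quality artistic portraits based on multi-scale content features extracted by an encoder, so as to better preserve frame details. The resulting fully convolutional architecture accepts non-aligned faces in videos of variable size as input, contributing to complete face regions with natural motion in the output. Our framework is compatible with existing StyleGAN-based image toonification models, extending them to video toonification, and inherits the appealing features of these models for flexible style control over color and intensity. This work presents two instantiations of VToonify, built on Toonify and DualStyleGAN, for collection-based and exemplar-based portrait video style transfer, respectively. Extensive experimental results demonstrate the effectiveness of the proposed VToonify framework over existing methods in generating high-quality, temporally coherent artistic portrait videos with flexible style controls.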
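Responsible AI is widely considered one of the greatest scientific challenges of our time and the key to unlocking the AI market and increasing adoption. To address the responsible AI challenge, a number of AI ethics principle frameworks have been published recently, to which AI systems are supposed to conform. However, without further best-practice guidance, practitioners are left with little beyond truisms. In addition, significant effort has been spent at the algorithm level rather than the system level, mainly focusing on a subset of mathematics-amenable ethical principles (such as privacy and fairness). Yet ethical issues can occur at any step of the development lifecycle, cutting across many AI, non-AI, and data components of systems beyond AI algorithms and models. To operationalize responsible AI from a system perspective, in this paper we adopt a pattern-oriented approach and present a Responsible AI Pattern Catalogue based on the results of a systematic multivocal literature review (MLR). Rather than staying at the ethical principle level or the algorithm level, we focus on patterns that AI system stakeholders can adopt in practice to ensure that the developed AI systems are responsible throughout the entire governance and engineering lifecycle. The Responsible AI Pattern Catalogue classifies the patterns into three groups: multi-level governance patterns, trustworthy process patterns, and responsible-AI-by-design product patterns. These patterns provide systematic and actionable guidance for stakeholders implementing responsible AI.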
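Temporal action localization aims to predict the boundary and category of each action instance in untrimmed long videos. Most previous methods based on anchors or proposals neglect the global-local context interaction across the entire video sequence. Moreover, their multi-stage designs cannot directly generate action boundaries and categories. To address these issues, this paper proposes a novel end-to-end model called the Adaptive Perception Transformer (AdaPerFormer for short). Specifically, AdaPerFormer explores a dual-branch multi-head self-attention mechanism. One branch handles global perception attention, which can model the entire video sequence and aggregate globally relevant context. The other branch concentrates on a local convolutional shift that aggregates intra-frame and inter-frame information through our bidirectional shift operation. The end-to-end nature of the model produces the boundaries and categories of video actions without extra steps. Extensive experiments and ablation studies are provided to reveal the effectiveness of our design. Our method achieves state-of-the-art accuracy on the THUMOS14 dataset (in terms of mAP@0.5, 42.6% mAP@0.7, and 62.7% mAP@avg) and obtains competitive performance on the ActivityNet-1.3 dataset, with an average mAP of 36.1%. Code and models are available at https://github.com/soupero/adaperformer.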
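In practice we regularly consider counterfactual questions such as "Would a diabetes patient have fared better had they chosen another medication?". Observational studies are growing in significance for answering such questions because they accumulate widely and are easier to obtain than randomized controlled trials (RCTs). Recently, some works have introduced representation learning and domain adaptation into counterfactual inference. However, most current works focus on the setting of binary treatments. None of them considers that the sample sizes of different treatments are imbalanced, in particular that the data examples in some treatment groups are relatively limited because of inherent user preference. In this paper, we design a new algorithmic framework for counterfactual inference that brings in ideas from meta-learning for estimating individual treatment effects (MetaITE) to fill these research gaps, especially considering multiple imbalanced treatments. Specifically, we regard data episodes among the treatment groups in counterfactual inference as meta-learning tasks. We train a meta-learner on a set of source treatment groups with sufficient samples and update the model by gradient descent on the target treatment, where samples are limited. In addition, we introduce two complementary losses. One is the supervised loss on the multiple source treatments. The other, which aligns the latent distributions across the treatment groups, is proposed to reduce their discrepancy. We perform experiments on two real-world datasets to evaluate inference accuracy and generalization ability. The experimental results show that MetaITE matches or outperforms state-of-the-art methods.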
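The train-on-source, adapt-on-target procedure in the abstract above can be illustrated with a small sketch. This is not the MetaITE algorithm: it swaps in a Reptile-style first-order meta-update, a linear outcome model, and a squared-error loss, and it omits the distribution-alignment loss entirely. All function names and hyperparameters are hypothetical.

```python
import numpy as np

def sgd_steps(w, x, y, lr, steps):
    """Run a few gradient-descent steps on mean squared error for a linear model."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    for _ in range(steps):
        grad = 2.0 * x.T @ (x @ w - y) / len(y)
        w = w - lr * grad
    return w

def meta_train(source_groups, dim, meta_lr=0.5, inner_lr=0.1,
               inner_steps=5, epochs=50):
    """Learn an initialization from source treatment groups with many samples.

    Each element of `source_groups` is an (x, y) pair for one treatment
    group. The update is Reptile-style (an assumed stand-in): move the
    shared weights toward the weights adapted to each group.
    """
    w = np.zeros(dim)
    for _ in range(epochs):
        for x, y in source_groups:
            adapted = sgd_steps(w, x, y, inner_lr, inner_steps)
            w = w + meta_lr * (adapted - w)
    return w

def adapt(w, x, y, lr=0.1, steps=5):
    """Fine-tune the meta-learned weights on a target group with few samples."""
    return sgd_steps(w, x, y, lr, steps)
```

In this toy setting, starting the target-group update from the meta-learned weights rather than from zeros leaves the model closer to the target solution after the same small number of gradient steps, which is the motivation the abstract gives for handling treatment groups with limited data.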
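Large-scale datasets have played an indispensable role in the recent success of face generation and editing and have significantly advanced emerging research fields. However, the academic community still lacks a video dataset with diverse facial attribute annotations, which is crucial for research on face-related videos. In this work, we propose a large-scale, high-quality, and diverse video dataset with rich facial attribute annotations, named the High-Quality Celebrity Video Dataset (CelebV-HQ). CelebV-HQ contains 35,666 video clips with a resolution of at least 512x512, involving 15,653 identities. All clips are manually labeled with 83 facial attributes covering appearance, action, and emotion. We conduct a comprehensive analysis of age, ethnicity, brightness stability, motion smoothness, head pose diversity, and data quality to demonstrate the diversity and temporal coherence of CelebV-HQ. Its versatility and potential are further validated on two representative tasks: unconditional video generation and video facial attribute editing. Finally, we envision the future potential of CelebV-HQ, as well as the new opportunities and challenges it will bring to related research directions. Data, code, and models are publicly available. Project page: https://celebv-hq.github.io.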
Federated learning is growing fast in academia and industry as a solution to the data hungriness and privacy issues of machine learning. Because a federated learning system is widely distributed, it requires careful system design thinking. To help design such systems, researchers have introduced multiple patterns and tactics that cover various aspects of system design. However, the multitude of patterns leaves designers confused about when to adopt which pattern. In this paper, we present a set of decision models for selecting patterns in federated learning architecture design, based on a systematic literature review on federated learning, to assist designers and architects who have limited knowledge of federated learning. Each decision model maps the functional and non-functional requirements of federated learning systems to a set of patterns. We also clarify the trade-offs among the patterns. We evaluated the decision models by mapping the decision patterns to concrete federated learning architectures from big tech firms to assess the models' correctness and usefulness. The evaluation results indicate that the proposed decision models bring structure to the federated learning architecture design process and help articulate the design rationale explicitly.